Contrastive Intrinsic Control for Unsupervised Reinforcement Learning

Neural Information Processing Systems

Unlike knowledge-based and data-based algorithms, competence-based algorithms simultaneously address both the exploration challenge and the distillation of the generated experience into reusable skills.



Unsupervised Reinforcement Learning with Contrastive Intrinsic Control

Neural Information Processing Systems

We introduce Contrastive Intrinsic Control (CIC), an unsupervised reinforcement learning (RL) algorithm that maximizes the mutual information between state-transitions and latent skill vectors. CIC utilizes contrastive learning between state-transitions and skill vectors to learn behaviour embeddings and maximizes the entropy of these embeddings as an intrinsic reward to encourage behavioural diversity. We evaluate our algorithm on the Unsupervised RL Benchmark (URLB) in the asymptotic state-based setting, which consists of a long reward-free pre-training phase followed by a short adaptation phase to downstream tasks with extrinsic rewards. We find that CIC improves over prior exploration algorithms in terms of adaptation efficiency to downstream tasks on state-based URLB.
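The contrastive objective described in the abstract can be illustrated with a small InfoNCE-style sketch. This is a generic contrastive lower bound on the mutual information between transition embeddings and skill embeddings, not the paper's exact estimator; the function name, shapes, and temperature are assumptions made for illustration.

```python
import numpy as np

def cic_infonce_loss(transition_emb, skill_emb, temperature=0.5):
    """InfoNCE-style lower bound on I(state-transitions; skills).

    transition_emb, skill_emb: (batch, dim) arrays where row i of each
    comes from the same trajectory (a positive pair); every other row
    in the batch serves as a negative.
    """
    # L2-normalize so the dot product is cosine similarity
    t = transition_emb / np.linalg.norm(transition_emb, axis=1, keepdims=True)
    s = skill_emb / np.linalg.norm(skill_emb, axis=1, keepdims=True)
    logits = t @ s.T / temperature  # (batch, batch) similarity matrix
    # log-softmax over each row; diagonal entries are the positive pairs
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Minimizing this loss pulls each transition embedding toward its own skill vector and pushes it away from the other skills in the batch, which is the mutual-information maximization the abstract refers to.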



A Additional Implementation Details

Neural Information Processing Systems

These hyperparameters are fixed throughout all domains. Tab. 1 details the hyperparameters used in MOSS, which are taken directly from prior work. We include the environment renders in Figure ??.

Table 2: Hyperparameters for MOSS and DQN. These hyperparameters are fixed throughout all domains.
  Action repeat                1
  Frame repeat                 12
  Seed frames                  4000
  n-step returns               3
  Mini-batch size              1048
  Discount (γ)                 0.99
  Optimizer                    Adam
  Learning rate                0.0001
  Agent update frequency       2
  Critic target EMA rate (τ)

We made modifications to MOSS to evaluate in discrete action settings. Tab. 2 details the hyperparameters used for Double DQN and MOSS in the ViZDoom environment.


SPECI: Skill Prompts based Hierarchical Continual Imitation Learning for Robot Manipulation

Xu, Jingkai, Nie, Xiangli

arXiv.org Artificial Intelligence

Real-world robot manipulation in dynamic unstructured environments requires lifelong adaptability to evolving objects, scenes and tasks. Traditional imitation learning relies on static training paradigms, which are ill-suited for lifelong adaptation. Although Continual Imitation Learning (CIL) enables incremental task adaptation while preserving learned knowledge, current CIL methods primarily overlook the intrinsic skill characteristics of robot manipulation or depend on manually defined and rigid skills, leading to suboptimal cross-task knowledge transfer. To address these issues, we propose Skill Prompts-based HiErarchical Continual Imitation Learning (SPECI), a novel end-to-end hierarchical CIL policy architecture for robot manipulation. The SPECI framework consists of a multimodal perception and fusion module for heterogeneous sensory information encoding, a high-level skill inference module for dynamic skill extraction and selection, and a low-level action execution module for precise action generation. To enable efficient knowledge transfer on both skill and task levels, SPECI performs continual implicit skill acquisition and reuse via an expandable skill codebook and an attention-driven skill selection mechanism. Furthermore, we introduce mode approximation to augment the last two modules with task-specific and task-sharing parameters, thereby enhancing task-level knowledge transfer. Extensive experiments on diverse manipulation task suites demonstrate that SPECI consistently outperforms state-of-the-art CIL methods across all evaluated metrics, revealing exceptional bidirectional knowledge transfer and superior overall performance.

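The attention-driven skill selection over an expandable skill codebook can be sketched minimally as follows. This is an illustrative stand-in, not SPECI's actual implementation; the names, shapes, and the single-query formulation are all assumptions.

```python
import numpy as np

def select_skills(query, codebook, temperature=1.0):
    """Attention-driven selection over a skill codebook.

    query:    (dim,) task/context embedding.
    codebook: (num_skills, dim) matrix, one learned skill prompt per row.
    Returns an attention-weighted mixture of codebook entries and the
    attention weights themselves.
    """
    scores = codebook @ query / temperature  # similarity of query to each skill
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ codebook, weights
```

Because the codebook is just a matrix of rows, "expanding" it amounts to appending a new row when a new skill is acquired, without retraining the selection mechanism.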


Integrating Functionalities To A System Via Autoencoder Hippocampus Network

Luo, Siwei

arXiv.org Artificial Intelligence

Integrating multiple functionalities into a system poses a fascinating challenge to the field of deep learning. While the precise mechanisms by which the brain encodes and decodes information, and learns diverse skills, remain elusive, memorization undoubtedly plays a pivotal role in this process. In this article, we delve into the implementation and application of an autoencoder-inspired hippocampus network in a multi-functional system. We propose an autoencoder-based memorization method for the policy function's parameters. Specifically, the encoder of the autoencoder maps the policy function's parameters to a skill vector, while the decoder retrieves the parameters via this skill vector. The policy function is dynamically adjusted to suit the corresponding task. A graph neural network over skill vectors is then employed to represent the homeomorphic topological structure of subtasks and to manage subtask execution.
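The encode/decode scheme described here can be sketched with a minimal linear autoencoder over a flattened parameter vector. The dimensions, the class name, and the use of a fixed orthonormal projection in place of learned weights are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

class ParamAutoencoder:
    """Minimal linear autoencoder over flattened policy parameters.

    The encoder compresses a policy's parameter vector into a short
    "skill vector"; the decoder retrieves an approximation of the
    parameters from that skill vector.
    """

    def __init__(self, param_dim, skill_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Orthonormal columns (via QR) stand in for learned weights
        self.enc, _ = np.linalg.qr(rng.normal(size=(param_dim, skill_dim)))

    def encode(self, params):
        return params @ self.enc  # (param_dim,) -> (skill_dim,) skill vector

    def decode(self, skill):
        return skill @ self.enc.T  # (skill_dim,) -> (param_dim,) parameters
```

With orthonormal encoder columns, decode is the transpose of encode, so any parameter vector lying in the encoder's column span is recovered exactly; a trained autoencoder would instead learn this span from stored policies.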


Computational Teaching for Driving via Multi-Task Imitation Learning

Gopinath, Deepak, Cui, Xiongyi, DeCastro, Jonathan, Sumner, Emily, Costa, Jean, Yasuda, Hiroshi, Morgan, Allison, Dees, Laporsha, Chau, Sheryl, Leonard, John, Chen, Tiffany, Rosman, Guy, Balachandran, Avinash

arXiv.org Artificial Intelligence

Driving is a sensorimotor task that is done often and requires a degree of competency that has to be taught. While daily driving is complex and safety-critical, performance driving requires a higher degree of competency in handling the vehicle at high speeds and at the limits of stability, and requires years of one-on-one instruction and practice to master. Although driving instructors can help drivers perform better and more safely [1], their availability is limited and costly. Hence, there is a clear need for automated teaching that can help drivers improve at the population scale. Driving instructors, e.g. in performance track driving [2], rely on their expertise in the driving task and their inference of students' skill levels to effectively teach students of various skill levels and learning styles. Instructors can gauge their students' skill levels and estimate what a student might do in a given scenario to provide contextually relevant verbal instructions to the student. For example, consider how an instructor in the passenger seat might instruct a student driver on the appropriate timing for braking or the lateral positioning of the car with respect to the racing line (the optimal minimum-time path around a race course). The teacher's ability to judge whether the student can maintain the racing line or will oversteer in a turn influences what instructions are provided. An automated teaching system for driving should be able to take in relevant vehicle context (pose and dynamics, map information, etc.) and other factors (e.g., driver monitoring) as inputs and output appropriate teaching actions for the


Emergent World Models and Latent Variable Estimation in Chess-Playing Language Models

Karvonen, Adam

arXiv.org Artificial Intelligence

Language models have shown unprecedented capabilities, sparking debate over the source of their performance. Is it merely the outcome of learning syntactic patterns and surface-level statistics, or do they extract semantics and a world model from the text? Prior work by Li et al. investigated this by training a GPT model on synthetic, randomly generated Othello games and found that the model learned an internal representation of the board state. We extend this work into the more complex domain of chess, training on real games and investigating our model's internal representations using linear probes and contrastive activations. The model is given no a priori knowledge of the game and is solely trained on next-character prediction, yet we find evidence of internal representations of board state. We validate these internal representations by using them to make interventions on the model's activations and edit its internal board state. Unlike Li et al.'s prior synthetic dataset approach, our analysis finds that the model also learns to estimate latent variables like player skill to better predict the next character. We derive a player skill vector and add it to the model, improving the model's win rate by up to 2.6 times.
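The "derive a skill vector and add it to the model" step follows the general contrastive-activation recipe: average activations recorded under two contrasting conditions and take the difference as a steering direction. The sketch below is an illustrative version of that recipe, not the paper's exact procedure; all names and shapes are assumptions.

```python
import numpy as np

def derive_skill_vector(high_skill_acts, low_skill_acts):
    """Difference of mean activations between contrasting conditions.

    high_skill_acts, low_skill_acts: (num_samples, dim) residual-stream
    activations recorded on games by strong vs. weak players. The mean
    difference points along the latent "player skill" direction.
    """
    return high_skill_acts.mean(axis=0) - low_skill_acts.mean(axis=0)

def steer(activation, skill_vector, scale=1.0):
    # At inference time, add the scaled skill direction to a layer's activation
    return activation + scale * skill_vector
```

Intervening this way, with the vector added at a chosen layer during generation, is how a latent variable like estimated player skill can be pushed toward "strong player" to change the model's move predictions.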